Structured and Deep Similarity Matching via Structured and Deep Hebbian Networks

Dina Obeid, Hugo Ramambason, Cengiz Pehlevan

Neural Information Processing Systems

Synaptic plasticity is widely accepted to be the mechanism behind learning in the brain's neural networks. A central question is how synapses, with access to only local information about the network, can still organize collectively and perform circuit-wide learning in an efficient manner.
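The kind of circuit the abstract describes can be illustrated with a toy similarity-matching network in which every weight update uses only locally available activities. This is a minimal sketch, not the paper's exact model: the dimensions, learning rate, and damping constant are illustrative assumptions.

```python
# Toy Hebbian/anti-Hebbian network: feedforward weights W learn via a
# Hebbian rule, lateral weights M via an anti-Hebbian rule, and every
# update touches only the pre- and post-synaptic activities (locality).
import random

random.seed(0)
n_in, n_out, lr = 4, 2, 0.05
W = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]  # feedforward weights
M = [[0.0] * n_out for _ in range(n_out)]                                     # lateral weights

def output(x, n_iter=100, eta=0.1):
    # Relax the recurrent dynamics y_i <- W_i . x - sum_{j != i} M_ij y_j
    # toward a fixed point with a damped update.
    y = [0.0] * n_out
    for _ in range(n_iter):
        for i in range(n_out):
            ff = sum(W[i][k] * x[k] for k in range(n_in))
            lat = sum(M[i][j] * y[j] for j in range(n_out) if j != i)
            y[i] += eta * (ff - lat - y[i])
    return y

def learn(x):
    y = output(x)
    for i in range(n_out):
        for k in range(n_in):   # Hebbian: uses only the local activities x_k, y_i
            W[i][k] += lr * (y[i] * x[k] - W[i][k])
        for j in range(n_out):  # anti-Hebbian: uses only the local activities y_i, y_j
            M[i][j] += lr * (y[i] * y[j] - M[i][j])

for _ in range(200):
    learn([random.gauss(0.0, 1.0) for _ in range(n_in)])
```

Each synapse sees only its own pre- and post-synaptic activity, yet the lateral anti-Hebbian terms coordinate the outputs circuit-wide, which is the collective-organization question the abstract raises.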


We thank the reviewers for their careful reading of our work and for their helpful comments

Neural Information Processing Systems

We thank the reviewers for their careful reading of our work and for their helpful comments. We will also clarify the text in sections 2.1 and 2.2. In terms of experimental predictions, our work predicts the synaptic weights in the SFA circuit. One mechanism for implementing a quadratic expansion is so-called "Sigma-Pi units" (Rumelhart and Hinton; Mel and Koch, 1990). In this case, the derivation proceeds exactly as laid out in the paper. Thank you for pointing out the typos.



Incorporating Visual Cortical Lateral Connection Properties into CNN: Recurrent Activation and Excitatory-Inhibitory Separation

Park, Jin Hyun, Zhang, Cheng, Choe, Yoonsuck

arXiv.org Artificial Intelligence

Biologically motivated neural networks for visual processing, such as the Neocognitron [Fukushima, 1980], Convolutional Neural Networks (CNNs) [LeCun et al., 1989], and HMAX [Riesenhuber and Poggio, 1999], drew inspiration from Hubel and Wiesel's work on primary visual cortical neurons [Hubel and Wiesel, 1959] and subsequent developments in the field. A common feature of these models is that alternating layers of simple cells and complex cells form a feed-forward hierarchy, starting with the afferent connections from the input (in CNNs, the convolutional and pooling layers may serve the same purpose [Lindsay, 2021]). The hierarchy in these models loosely mimics the projections among different cortical areas in the visual pathway [Felleman and Van Essen, 1991], with each convolutional layer corresponding to a distinct visual cortical area and the connections serving as long-range projections. Functionally, feature representations in CNNs also show close similarity to those in the ventral visual pathway [Zeiler, 2014]. This view has a major shortcoming, however: the visual cortical areas do not form a strict hierarchy, since feedback connections between the visual areas form recurrent loops [Briggs, 2020]. Some architectural features in modern CNN variants may serve this purpose. For instance, Liao and Poggio [Liao and Poggio, 2016] proposed that skip connections in ResNet [He et al., 2016] can be seen as implementing such recurrent projections (see also Recurrent CNN [Liang and Hu, 2015]).
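The title's two ingredients, recurrent activation and excitatory-inhibitory separation, can be sketched in a few lines. This is a hypothetical toy, not the authors' architecture: units interact laterally through a nonnegative excitatory weight matrix and a separately stored inhibitory one, iterated for a few recurrent steps.

```python
# Toy recurrent lateral interaction with separate excitatory and
# inhibitory connections (Dale's-law-style separation); all weights
# and shapes are illustrative assumptions.
def lateral_recurrence(feedforward, w_exc, w_inh, n_steps=5):
    """Iteratively refine unit activations with E/I lateral input.

    feedforward: feedforward drive per unit
    w_exc[i][j] >= 0: excitatory lateral weight from unit j to unit i
    w_inh[i][j] >= 0: inhibitory lateral strength (applied with a minus sign)
    """
    n = len(feedforward)
    act = [max(0.0, v) for v in feedforward]  # rectified feedforward response
    for _ in range(n_steps):
        new_act = []
        for i in range(n):
            lat = sum((w_exc[i][j] - w_inh[i][j]) * act[j]
                      for j in range(n) if j != i)
            new_act.append(max(0.0, feedforward[i] + lat))  # recurrent activation
        act = new_act
    return act
```

Keeping excitation and inhibition in two sign-constrained matrices, rather than one unconstrained matrix, is the structural property the abstract borrows from cortical lateral circuits.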



Early prediction of the transferability of bovine embryos from videomicroscopy

Hachani, Yasmine, Bouthemy, Patrick, Fromont, Elisa, Ruffini, Sylvie, Laffont, Ludivine, Reis, Alline de Paula

arXiv.org Artificial Intelligence

Videomicroscopy, combined with machine learning, is a promising tool for studying the early development of in vitro fertilized bovine embryos and assessing their transferability as early as possible. We aim to predict embryo transferability within four days at most, taking 2D time-lapse microscopy videos as input. We formulate this problem as supervised binary classification with the classes transferable and not transferable. The challenges are three-fold: 1) poorly discriminating appearance and motion, 2) class ambiguity, 3) a small amount of annotated data. We propose a 3D convolutional neural network involving three pathways, which makes it multi-scale in time and able to handle appearance and motion in different ways. For training, we use the focal loss. Our model, named SFR, compares favorably to other methods. Experiments demonstrate its effectiveness and accuracy on our challenging biological task.
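The "multi-scale in time" idea behind the three pathways can be illustrated by subsampling the frame sequence at different temporal strides, so each pathway sees the embryo's motion at a different tempo. The strides and toy frames below are illustrative assumptions, not the SFR model's actual configuration.

```python
# Toy three-pathway temporal split: each pathway receives the same video
# subsampled at a different stride (fine to coarse temporal resolution).
def temporal_pathways(frames, strides=(1, 4, 16)):
    """frames: list of 2D frames; returns one subsequence per pathway."""
    return [frames[::s] for s in strides]

video = [[[float(t)]] for t in range(32)]  # 32 toy 1x1 "frames"
fine, mid, coarse = temporal_pathways(video)
```

In a 3D CNN, each pathway would then convolve its subsequence over space and time before the pathway outputs are fused for classification.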


Linked Adapters: Linking Past and Future to Present for Effective Continual Learning

Chandra, Dupati Srikar, Srijith, P. K., Rezazadegan, Dana, McCarthy, Chris

arXiv.org Artificial Intelligence

Continual learning allows a system to learn and adapt to new tasks while retaining the knowledge acquired from previous tasks. However, deep learning models suffer from catastrophic forgetting of knowledge learned from earlier tasks while learning a new task. Moreover, retraining large models like transformers from scratch for every new task is costly. An effective approach to continual learning is to use a large pre-trained model with task-specific adapters to adapt to new tasks. Though this approach can mitigate catastrophic forgetting, it fails to transfer knowledge across tasks, as the adapters for each task are learned separately. To address this, we propose a novel approach, Linked Adapters, that allows knowledge transfer to other task-specific adapters through a weighted attention mechanism. Linked Adapters use a multi-layer perceptron (MLP) to model the attention weights, which enables backward knowledge transfer in continual learning in addition to modeling forward knowledge transfer. During inference, our proposed approach effectively leverages knowledge transfer through MLP-based attention weights across all the lateral task adapters. Through numerous experiments on diverse image classification datasets, we demonstrate improved performance on continual learning tasks using Linked Adapters.
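The core mechanism described here, an MLP producing attention weights that mix the outputs of other task adapters, can be sketched as follows. This is a hedged toy, not the paper's implementation: the embedding size, MLP shape, and the way adapter outputs are combined are all illustrative assumptions.

```python
# Toy Linked-Adapters-style mixing: a small MLP scores each task
# embedding, a softmax turns scores into attention weights, and the
# current representation is a weighted sum of per-task adapter outputs.
import math
import random

random.seed(1)
DIM = 4

def mlp_attention(task_embs, params):
    # One-hidden-layer MLP score per task embedding, followed by softmax.
    W1, b1, W2, b2 = params
    scores = []
    for e in task_embs:
        h = [math.tanh(sum(W1[i][k] * e[k] for k in range(DIM)) + b1[i])
             for i in range(DIM)]
        scores.append(sum(W2[k] * h[k] for k in range(DIM)) + b2)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [v / z for v in exps]

def linked_adapter_output(adapter_outputs, attn):
    # Weighted sum of the per-task adapter outputs.
    return [sum(a * out[k] for a, out in zip(attn, adapter_outputs))
            for k in range(DIM)]

params = ([[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)],
          [0.0] * DIM,
          [random.uniform(-1, 1) for _ in range(DIM)],
          0.0)
embs = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
attn = mlp_attention(embs, params)
```

Because the MLP, rather than a fixed table, produces the weights, it can also assign attention to adapters of tasks seen after a given task was trained, which is how the abstract's backward knowledge transfer is modeled.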


Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation

Hafez, Muhammad Burhan, Erekmen, Kerim

arXiv.org Artificial Intelligence

Central to the development of universal learning systems is the ability to solve multiple tasks without retraining from scratch when new data arrives. This is crucial because each task requires significant training time. Addressing the problem of continual learning necessitates various methods due to the complexity of the problem space. This problem space includes: (1) addressing catastrophic forgetting to retain previously learned tasks, (2) demonstrating positive forward transfer for faster learning, (3) ensuring scalability across numerous tasks, and (4) facilitating learning without requiring task labels, even in the absence of clear task boundaries. In this paper, the Task-Agnostic Policy Distillation (TAPD) framework is introduced. This framework alleviates problems (1)-(4) by incorporating a task-agnostic phase, where an agent explores its environment without any external goal and maximizes only its intrinsic motivation. The knowledge gained during this phase is later distilled for further exploration. Therefore, the agent acts in a self-supervised manner by systematically seeking novel states. By utilizing task-agnostic distilled knowledge, the agent can solve downstream tasks more efficiently, leading to improved sample efficiency. Our code is available at the repository: https://github.com/wabbajack1/TAPD.
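The distillation step at the heart of this framework can be illustrated with the generic policy-distillation objective: train a student policy to match a teacher's softened action distribution via a KL divergence. This is a sketch of that generic objective, not the exact TAPD loss; the temperature and action-set size are illustrative assumptions.

```python
# Generic policy distillation loss for one state with a discrete action
# set: KL(teacher || student) between temperature-softened distributions.
import math

def kl_divergence(p_teacher, p_student):
    # KL(teacher || student) over a discrete action distribution.
    return sum(p * math.log(p / q)
               for p, q in zip(p_teacher, p_student) if p > 0)

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    # Soften both distributions with a temperature before matching them.
    def softmax(logits):
        m = max(l / temperature for l in logits)
        exps = [math.exp(l / temperature - m) for l in logits]
        z = sum(exps)
        return [e / z for e in exps]
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))
```

Minimizing this loss over states visited during the task-agnostic, intrinsically motivated phase compresses the exploratory policy's knowledge into the student, which can then be reused for downstream tasks.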